-
The optimal transport barycenter (a.k.a. Wasserstein barycenter) is a fundamental notion of averaging that extends from the Euclidean space to the Wasserstein space of probability distributions. Computing the unregularized barycenter for discretized probability distributions on point clouds is challenging when the domain dimension $d > 1$. Most practical algorithms for approximating the barycenter are based on entropic regularization. In this paper, we introduce a primal-dual algorithm with nearly linear time complexity $O(m \log m)$ and linear space complexity $O(m)$, the Wasserstein-Descent $\dot{\mathbb{H}}^1$-Ascent (WDHA) algorithm, for computing the exact barycenter when the input probability density functions are discretized on an $m$-point grid. The success of the WDHA algorithm hinges on alternating between two different yet closely related optimization geometries, Wasserstein and Sobolev, for the primal barycenter and dual Kantorovich potential subproblems. Under reasonable assumptions, we establish the convergence rate and iteration complexity of WDHA to its stationary point when the step size is appropriately chosen. Superior computational efficacy, scalability, and accuracy over the existing Sinkhorn-type algorithms are demonstrated on high-resolution (e.g., 1024×1024 images) 2D synthetic and real data.
Free, publicly-accessible full text available July 13, 2026
-
With the advance of science and technology, more and more data are collected in the form of functions. A fundamental question for a pair of random functions is whether they are independent. This problem becomes quite challenging when the random trajectories are sampled irregularly and sparsely for each subject. In other words, each random function is only sampled at a few time-points, and these time-points vary with subjects. Furthermore, the observed data may contain noise. To the best of our knowledge, there exists no consistent test in the literature for the independence of sparsely observed functional data. We show in this work that testing pointwise independence simultaneously is feasible. The test statistics are constructed by integrating pointwise distance covariances (Székely et al., 2007) and are shown to converge, at a certain rate, to their corresponding population counterparts, which characterize the simultaneous pointwise independence of two random functions. The performance of the proposed methods is further verified by Monte Carlo simulations and analysis of real data.
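As a rough illustration of the building block named in this abstract, the sketch below computes the sample squared distance covariance of Székely et al. (2007) for two univariate samples observed at a single time point. It is only a minimal sketch: the function names are hypothetical, and the integration over time and the handling of sparse, noisy observations described in the abstract are not reproduced here.

```python
import numpy as np


def _centered_distances(values):
    """Double-centered matrix of pairwise absolute differences (helper name is illustrative)."""
    v = np.asarray(values, dtype=float)
    d = np.abs(v[:, None] - v[None, :])
    # Subtract row and column means, then add back the grand mean.
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_covariance_sq(x, y):
    """Sample squared distance covariance (V-statistic form) of paired samples x and y."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    if a.shape != b.shape:
        raise ValueError("x and y must contain the same number of paired observations")
    return float((a * b).mean())
```

For example, `distance_covariance_sq([0, 1], [0, 1])` evaluates to 0.25, while pairing any sample with a constant sample gives 0, since the centered distance matrix of a constant sample vanishes.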
-
Series of univariate distributions indexed by equally spaced time points are ubiquitous in applications and their analysis constitutes one of the challenges of the emerging field of distributional data analysis. To quantify such distributional time series, we propose a class of intrinsic autoregressive models that operate in the space of optimal transport maps. The autoregressive transport models that we introduce here are based on regressing optimal transport maps on each other, where predictors can be transport maps from an overall barycenter to a current distribution or transport maps between past consecutive distributions of the distributional time series. Autoregressive transport models and their associated distributional regression models specify the link between predictor and response transport maps by moving along geodesics in Wasserstein space. These models emerge as natural extensions of the classical autoregressive models in Euclidean space. Unique stationary solutions of autoregressive transport models are shown to exist under a geometric moment contraction condition of Wu & Shao [(2004) Limit theorems for iterated random functions. Journal of Applied Probability 41, 425–436], using properties of iterated random functions. We also discuss an extension to a varying coefficient model for first-order autoregressive transport models. In addition to simulations, the proposed models are illustrated with distributional time series of house prices across U.S. counties and annual summer temperature distributions.
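For univariate distributions, 2-Wasserstein geodesics and barycenters reduce to linear interpolation and averaging of quantile functions. The sketch below illustrates that standard fact for empirical distributions given as equal-size samples, where the sorted sample serves as the empirical quantile function. The function names are hypothetical, and this is not the autoregressive transport model of the abstract, only the geometric notion it builds on.

```python
import numpy as np


def wasserstein_geodesic_1d(x, y, t):
    """Point at time t in [0, 1] on the 2-Wasserstein geodesic between two
    empirical distributions given as equal-size samples x and y."""
    qx = np.sort(np.asarray(x, dtype=float))
    qy = np.sort(np.asarray(y, dtype=float))
    if qx.shape != qy.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    # In one dimension the geodesic interpolates the quantile functions linearly.
    return (1.0 - t) * qx + t * qy


def wasserstein_barycenter_1d(samples):
    """Equal-weight 2-Wasserstein barycenter of several equal-size samples,
    obtained by averaging their sorted values (empirical quantiles)."""
    quantiles = np.stack([np.sort(np.asarray(s, dtype=float)) for s in samples])
    return quantiles.mean(axis=0)
```

With two input samples, the barycenter coincides with the geodesic midpoint, `wasserstein_geodesic_1d(x, y, 0.5)`.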
-
Testing the homogeneity between two samples of functional data is an important task. While this is feasible for intensively measured functional data, we explain why it is challenging for sparsely measured functional data and show what can be done for such data. In particular, we show that testing the marginal homogeneity based on point-wise distributions is feasible under some mild constraints and propose a new two-sample statistic that works well with both intensively and sparsely measured functional data. The proposed test statistic is formulated upon energy distance, and the convergence rate of the test statistic to its population version is derived along with the consistency of the associated permutation test. The aptness of our method is demonstrated on both synthetic and real data sets.
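To make the two ingredients named in this abstract concrete, the sketch below computes the sample energy distance between two univariate samples and a permutation p-value based on it. This is a minimal sketch for a single time point: the function names, the number of permutations, and the seed are illustrative assumptions, and the paper's statistic for sparsely measured functional data is not reproduced here.

```python
import numpy as np


def energy_distance(x, y):
    """Sample energy distance 2 E|X - Y| - E|X - X'| - E|Y - Y'| (V-statistic form)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xy = np.abs(x[:, None] - y[None, :]).mean()
    xx = np.abs(x[:, None] - x[None, :]).mean()
    yy = np.abs(y[:, None] - y[None, :]).mean()
    return float(2.0 * xy - xx - yy)


def permutation_pvalue(x, y, n_perm=200, seed=0):
    """Permutation p-value for the null hypothesis that x and y share a distribution."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = energy_distance(x, y)
    pooled = np.concatenate([x, y])
    n = len(x)
    exceed = 0
    for _ in range(n_perm):
        # Reassign group labels at random and recompute the statistic.
        perm = rng.permutation(pooled)
        if energy_distance(perm[:n], perm[n:]) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

For example, `energy_distance([0, 0], [1, 1])` evaluates to 2.0, and the energy distance between a sample and itself is 0.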
-
The maturation of regional brain volumes from birth to preadolescence is a critical developmental process that underlies emerging brain structural connectivity and function. Regulated by genes and environment, the coordinated growth of different brain regions plays an important role in cognitive development. Current knowledge about structural network evolution is limited, partly due to the sparse and irregular nature of most longitudinal neuroimaging data. In particular, it is unknown how factors such as mother’s education or sex of the child impact the structural network evolution. To address this issue, we propose a method to construct evolving structural networks and study how the evolving connections among brain regions as reflected at the network level are related to maternal education and biological sex of the child and also how they are associated with cognitive development. Our methodology is based on applying local Fréchet regression to longitudinal neuroimaging data acquired from the RESONANCE cohort, a cohort of healthy children (245 females and 309 males) ranging in age from 9 weeks to 10 years. Our findings reveal that sustained highly coordinated volume growth across brain regions is associated with lower maternal education and lower cognitive development. This suggests that higher neurocognitive performance levels in children are associated with increased variability of regional growth patterns as children age.
-
Brain growth in early childhood is reflected in the evolution of the proportional volumes of cerebrospinal fluid (pCSF), grey matter (pGM), and white matter (pWM). We study brain development as reflected in the relative fractions of these three tissues for a cohort of 388 children who were longitudinally followed between the ages of 18 and 96 months. We introduce statistical methodology (Riemannian Principal Analysis through Conditional Expectation, RPACE) that addresses major challenges that are of general interest for the analysis of longitudinal neuroimaging data, including the sparsity of the longitudinal observations over time and the compositional structure of the relative brain volumes. Applying the RPACE methodology, we find that longitudinal growth as reflected by tissue composition differs significantly between children of mothers with higher and lower education levels.
